{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Suppose that at step $t$ the EM iteration has converged, so that $\\theta = \\theta^*$. The M-step objective is"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "M = \\sum_i^m\\sum_{z^{(i)}} Q_i^*(z^{(i)}) \\log \\frac{p(x^{(i)}, z^{(i)}; \\theta)}{Q_i^*(z^{(i)})}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "where\n",
    "\n",
    "$$\n",
    "Q_i^*(z^{(i)}) = p(z^{(i)} | x^{(i)}; \\theta^*) = \\frac{p(x^{(i)}, z^{(i)}; \\theta^*)}{p(x^{(i)}; \\theta^*)}\n",
    "$$\n",
    "\n",
    "for each $i$, as computed in the E-step. Note that $Q_i^*(z^{(i)})$ is a constant, independent of $\\theta$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Take the gradient of $M$ with respect to $\\theta$:\n",
    "\n",
    "\\begin{align*}\n",
    "\\nabla_{\\theta}{M} \n",
    "&= \\sum_i^m\\sum_{z^{(i)}} Q_i^*(z^{(i)}) \\frac{Q_i^*(z^{(i)})}{p(x^{(i)}, z^{(i)};\\theta)} \\frac{1}{Q_i^*(z^{(i)})} \\nabla_{\\theta}p(x^{(i)}, z^{(i)};\\theta) \\\\\n",
    "&= \\sum_i^m\\sum_{z^{(i)}} \\frac{p(x^{(i)}, z^{(i)}; \\theta^*)}{p(x^{(i)}; \\theta^*)} \\frac{\\nabla_{\\theta}p(x^{(i)}, z^{(i)};\\theta)}{p(x^{(i)}, z^{(i)};\\theta)} \\\\\n",
    "\\end{align*}"
   ]
  },
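  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since $\\theta$ has converged to $\\theta^*$, the M-step leaves it unchanged: $\\theta^*$ maximizes $M$, so $\\nabla_{\\theta}M = 0$ at $\\theta = \\theta^*$ (assuming the maximum is attained in the interior of the parameter space)."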
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\\begin{align*}\n",
    "\\nabla_{\\theta}{M}|_{\\theta = \\theta^*} \n",
    "&= \\sum_i^m\\sum_{z^{(i)}} \\frac{p(x^{(i)}, z^{(i)}; \\theta^*)}{p(x^{(i)}; \\theta^*)}\\frac{\\nabla_{\\theta}p(x^{(i)}, z^{(i)};\\theta)|_{\\theta = \\theta^*}}{p(x^{(i)}, z^{(i)};\\theta^*)} \\\\\n",
    "&= \\sum_i^m\\sum_{z^{(i)}} \\frac{\\nabla_{\\theta}p(x^{(i)}, z^{(i)};\\theta)|_{\\theta = \\theta^*}}{p(x^{(i)}; \\theta^*)} \\\\\n",
    "&= \\sum_i^m \\frac{\\nabla_{\\theta}p(x^{(i)}; \\theta)|_{\\theta = \\theta^*}}{p(x^{(i)}; \\theta^*)} \\\\\n",
    "&= \\sum_i^m \\nabla_{\\theta} \\log p(x^{(i)}; \\theta)|_{\\theta = \\theta^*} \\\\\n",
    "&= \\nabla_{\\theta}  \\sum_i^m \\log p(x^{(i)}; \\theta)|_{\\theta = \\theta^*} \\\\\n",
    "&= \\nabla_{\\theta} \\ell(\\theta)|_{\\theta = \\theta^*} \\\\\n",
    "&= 0\n",
    "\\end{align*}"
   ]
  },
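  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The third and fourth equalities above compress two standard steps. As a brief sketch (assuming $z^{(i)}$ ranges over a finite set, so that the sum and the gradient can be exchanged, and that $p(x^{(i)}; \\theta) > 0$): summing the joint over $z^{(i)}$ gives the marginal, and the log-derivative identity turns the ratio into the gradient of a log.\n",
    "\n",
    "\\begin{align*}\n",
    "\\sum_{z^{(i)}} \\nabla_{\\theta}p(x^{(i)}, z^{(i)};\\theta)\n",
    "&= \\nabla_{\\theta} \\sum_{z^{(i)}} p(x^{(i)}, z^{(i)};\\theta)\n",
    "= \\nabla_{\\theta}p(x^{(i)}; \\theta) \\\\\n",
    "\\frac{\\nabla_{\\theta}p(x^{(i)}; \\theta)}{p(x^{(i)}; \\theta)}\n",
    "&= \\nabla_{\\theta} \\log p(x^{(i)}; \\theta)\n",
    "\\end{align*}"
   ]
  },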
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Therefore, upon convergence, setting $\\nabla_{\\theta}M = 0$ is equivalent to setting $\\nabla_{\\theta} \\ell = 0$: a fixed point of EM is a stationary point of the log-likelihood $\\ell$."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
